Additions to wiki - Model Interpretation Documentation#75
TimCookCountyDS wants to merge 33 commits into
copy-edit rmv training
copy edit 2
copy edits
outline edit
update broke assessment metrics links
Edit links
edit hyperlinks 2
close parens
Save link in readme
Could we resolve merge conflicts and get at least a tiny description for this PR before review?
My main thought is whether we want to include key aspects of this in the checklist.
@Damonamajor, definitely aligned with that sentiment. My thought is we could just link this document in the checklist, with reference to specific sections. I can go ahead and do that, and then call this finished.
### A. Balance Tests
*(See the "Statistical Tests" section of the model performance report.)* In a perfectly matched sample, no feature would predict whether a property is included in or excluded from the sales sample. Any feature that predicts inclusion in the sales set at a level greater than chance (that is, with statistical significance) is likely over- or under-represented in the sample and may bias your results. (This is especially true for features that also turn out to have high SHAP values in your results.) To check this, we run a simple logistic regression predicting the likelihood of a sale given a property's features. The resulting p-value for each feature in the report tells you whether that feature predicts inclusion in the sample at a level greater than expected by chance, while the Beta value gives you a relative sense of the weight (importance) and direction (include vs. exclude) of that feature. Low p-values suggest statistical significance, and Betas with high magnitudes suggest a large impact. (In our report, asterisks mark statistically significant predictors.) When a feature is predictive of inclusion in the sample, your sample is likely biased toward (or away from) properties with this feature, and the model may therefore value these or other properties inaccurately.
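A minimal sketch of the balance test described above, using only the Python standard library and synthetic data. This is not the report's own code: the feature names `sqft_z` and `age_z`, the simulated sale probabilities, and the helper functions `solve` and `fit_logit` are all assumptions made for illustration. It fits a logistic regression of "was this parcel sold?" on the features, then prints each Beta with its p-value and an asterisk for p < 0.05.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_logit(X, y, iters=25):
    """Fit a logistic regression by Newton-Raphson.

    X rows must start with 1.0 for the intercept. Returns the coefficients,
    their standard errors, and two-sided p-values from a normal approximation.
    """
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            eta = sum(b * x for b, x in zip(beta, xi))
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for j in range(k):
                grad[j] += (yi - p) * xi[j]
                for l in range(k):
                    hess[j][l] += w * xi[j] * xi[l]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    # Standard errors come from the diagonal of the inverse information matrix.
    se = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se.append(math.sqrt(solve(hess, unit)[j]))
    pvals = [math.erfc(abs(b / s) / math.sqrt(2.0)) for b, s in zip(beta, se)]
    return beta, se, pvals


# Synthetic "population" of parcels (hypothetical, standardized features).
# By construction, larger homes are more likely to sell; age has no effect.
random.seed(0)
names = ["intercept", "sqft_z", "age_z"]
X, y = [], []
for _ in range(2000):
    sqft_z, age_z = random.gauss(0, 1), random.gauss(0, 1)
    p_sale = 1.0 / (1.0 + math.exp(-(-1.0 + 0.8 * sqft_z)))
    X.append([1.0, sqft_z, age_z])
    y.append(1 if random.random() < p_sale else 0)

beta, se, pvals = fit_logit(X, y)
for name, b, p in zip(names, beta, pvals):
    flag = "*" if p < 0.05 else ""
    print(f"{name:>9}  beta={b:+.3f}  p={p:.4f} {flag}")
```

In this sketch the features are standardized, so the Beta magnitudes are roughly comparable across features. With features on different scales, a larger Beta does not by itself mean a larger impact.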
I also like the note on where we can find this. It may be a bit too nitty to do this for every section, but maybe under the big headers we could note which sections of the reports contain the different interpretations.
fixed link
fix bias variance links
ccao-jardine
left a comment
This is solid, thanks to everyone who has written and pitched in to review! I'm here to add my own comments. Nitpicks are optional; anything not marked as a nitpick I'd like to discuss or resolve.
Co-authored-by: Nicole Jardine <138712135+ccao-jardine@users.noreply.github.com>
corrected inaccurate links for lgbm missingness handling
fixed progress and poverty link
@ccao-jardine, feedback incorporated. Let me know if there's anything else needed before I merge.
wrridgeway
left a comment
Looking good! I'll do another review once we've added something like a terms section to make this a bit easier to read through.
**Overview:**

1. Assessing how representative your sales sample is of the assessment set.
   - a. Balance tests
   - b. Visual inspection
   - c. Not missing at random
   - d. Domain specific approach

2. Noting any real-world housing market changes that may impact your model, and/or interactions between data and model that may affect your results (model drift, data drift).

3. Interpreting model performance (evaluating machine learning and assessment metrics).
**Overview:**
1. Assessing how representative your sales sample is of the assessment set.
   - a. Balance tests
   - b. Visual inspection
   - c. Not missing at random
   - d. Domain specific approach
2. Noting any real-world housing market changes that may impact your model, and/or interactions between data and model that may affect your results (model drift, data drift).
3. Interpreting model performance (evaluating machine learning and assessment metrics).
There is an "Outline" button next to markdown files that already provides this feature in a really clean way.
Or, if we're committed to this outline, I'd link to the sections through it using section links.
I would replace this outline with a "terms" section and then try to clean up the constant switching between population, sample, sales, and assessment. It's a lot of parentheses and extra language that we could get out of the way super quick and then use a couple small, consistent terms throughout. I tried to clean this up in the word doc but perhaps I made it worse.
Useful terms
- Sample: the universe of parcel sales we use to train and test our model
- Population: the universe of parcels that the model needs to value
etc...
Great call. I think it's really important to be clear on these terms (population, sample, sales, and assessment), as they can be a source of confusion when discussing different model outputs and types of evaluation (especially with regard to differences between ML evaluation and domain-specific evaluation).
Co-authored-by: William Ridgeway <10358980+wrridgeway@users.noreply.github.com>
No description provided.